-
Notifications
You must be signed in to change notification settings - Fork 557
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
delegated operator fixes #4908
delegated operator fixes #4908
Conversation
WalkthroughThe changes in this pull request focus on the Changes
Possibly related PRs
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
Documentation and Community
|
@@ -312,7 +312,7 @@ def update_run_state( | |||
|
|||
doc = self._collection.find_one_and_update( | |||
filter={"_id": _id}, | |||
update=[update], | |||
update=update, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this should be an array (aggregation pipeline). Otherwise passing a dictionary will be interpreted as the new doc so you'd essentially be setting the doc to have a new field "$set" with value equal to the dictionary value
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You may be thinking of find_one_and_replace()
. This change works and tests pass.
It's more standard to pass a dict with the updates. It's possible, though uncommon, to use an agg pipeline, but I didn't see why that was necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternatively if you don't specify $set
and pass a replacement document instead, then it would behave as you say.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://www.mongodb.com/docs/manual/reference/method/db.collection.findOneAndUpdate/
I'm just going off of the mongo docs...
Since the change didn't require any corresponding test changes, are there tests that would catch this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But yeah the alternative is to remove set
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have to go off of the pymongo docs here since that's what we use. https://pymongo.readthedocs.io/en/stable/api/pymongo/collection.html#pymongo.collection.Collection.find_one_and_update
I noticed that if you call find_one_and_update
it actually uses the deprecated MongoDB function findAndModify
.
https://github.com/mongodb/mongo-python-driver/blob/master/pymongo/synchronous/collection.py#L3561
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And yes the test test_updates_progress
tests for this. After talking with Ibrahim I am also making an update and adding a test for a specific corner case that's broken.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (1)
tests/unittests/delegated_operators_tests.py (1)
399-427
: LGTM! Well-structured test for handling null metadata.The new test method
test_output_schema_null_metadata
is a valuable addition. It effectively covers the edge case of handling null metadata when completing an operation with anoutputs_schema
. The test structure is clear and the assertions are comprehensive.Consider adding a brief comment explaining the purpose of setting metadata to null, as it might not be immediately clear to other developers why this step is necessary:
# Simulate a scenario where metadata is explicitly set to null self.svc._repo._collection.find_one_and_update( {"_id": bson.ObjectId(doc.id)}, {"$set": {"metadata": None}} )
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
📒 Files selected for processing (2)
- fiftyone/factory/repos/delegated_operation.py (3 hunks)
- tests/unittests/delegated_operators_tests.py (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- fiftyone/factory/repos/delegated_operation.py
🧰 Additional context used
📓 Learnings (1)
📓 Common learnings
Learnt from: swheaton PR: voxel51/fiftyone#4893 File: fiftyone/factory/repos/delegated_operation.py:328-328 Timestamp: 2024-10-09T02:11:07.857Z Learning: In the `MongoDelegatedOperationRepo.update_run_state` method, when calling `find_one_and_update`, pass the `update` parameter directly as a dictionary, not wrapped in a list.
@@ -310,9 +313,15 @@ def update_run_state( | |||
update["$set"]["status"] = progress | |||
update["$set"]["status"]["updated_at"] = datetime.utcnow() | |||
|
|||
# Using pipeline update instead of a single update doc fixes a case |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the update can either be a pipeline or a single update, why do we need to only make it a pipeline under certain conditions and not just always like the current code? Is there a different bug that can happen by consistently using pipeline for all cases?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's nonstandard. I'd prefer to use the standard approach and deviate only when necessary.
7b4e36d
to
a254324
Compare
@imanjra @kaixi-wang please re-review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (1)
tests/unittests/operators/delegated_tests.py (1)
488-516
: LGTM! Consider adding a docstring.The test implementation thoroughly verifies the handling of null metadata in operations. It's a good addition that covers an important edge case.
Add a docstring to improve documentation:
def test_output_schema_null_metadata( self, mock_get_operator, mock_operator_exists ): + """ + Test that outputs_schema is correctly set in metadata when the metadata + field is explicitly null (not just unset). + """ mock_outputs = MockOutputs() doc = self.svc.queue_operation(
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
📒 Files selected for processing (2)
- fiftyone/factory/repos/delegated_operation.py (3 hunks)
- tests/unittests/operators/delegated_tests.py (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- fiftyone/factory/repos/delegated_operation.py
🧰 Additional context used
🔇 Additional comments (1)
tests/unittests/operators/delegated_tests.py (1)
14-14
: LGTM!The addition of the
bson
import is appropriate for handling ObjectId operations in the new test method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
What changes are proposed in this pull request?
Addresses some weirdness introduced in #4436.
if
on line 272 should be indented to be under theCOMPLETED
case. Otherwise the if statement flow is goofy and should be reworked anyways.update
param is unusual, though legal. I'm wondering what was intended with the addition of the$ifNull
? As written, I don't think it does much because theoutput_schema
is fixed.How is this patch tested? If it is not, please explain why.
test still passes
Release Notes
Is this a user-facing change that should be mentioned in the release notes?
notes for FiftyOne users.
What areas of FiftyOne does this PR affect?
fiftyone
Python library changesSummary by CodeRabbit
New Features
Bug Fixes
outputs_schema
to prevent unintended document creation.Tests
outputs_schema
.